AITopics

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada (0.04)
Europe > Spain (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Neural Information Processing Systems, Dec-24-2025, 23:21:39 GMT

Relation-Constrained Decoding for Text Generation

The dominant paradigm for neural text generation today is seq2seq learning with large-scale pretrained language models, but it is usually difficult to manually constrain the generation process of these models. Prior studies have introduced Lexically Constrained Decoding (LCD) to ensure the presence of pre-specified words or phrases in the output. However, lexical constraints alone do not guarantee the grammatical or semantic relations between those words, so more elaborate constraints are needed. To this end, we first propose a new constrained decoding scenario named Relation-Constrained Decoding (RCD), which requires the model's output to contain several given word pairs while respecting the given relations between them.
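The difference between a lexical constraint and a relation constraint can be illustrated with a small check. This is a minimal sketch, not the paper's decoding algorithm: the `Triple` type, both function names, and the hand-written dependency triples are hypothetical, and a real system would obtain the triples from a dependency parser.

```python
from typing import NamedTuple, Iterable, List


class Triple(NamedTuple):
    """A (head, relation, dependent) dependency triple. Hypothetical type."""
    head: str
    relation: str
    dependent: str


def words_present(tokens: Iterable[str], constraints: List[Triple]) -> bool:
    """Lexical check only: every constrained word occurs somewhere in the output."""
    vocab = set(tokens)
    return all(c.head in vocab and c.dependent in vocab for c in constraints)


def relations_satisfied(parse: Iterable[Triple], constraints: List[Triple]) -> bool:
    """Relation check: every constrained word pair occurs with the given relation."""
    triples = set(parse)
    return all(c in triples for c in constraints)


# Constraints: "dog" must be the subject of "eats", and "big" must modify "bone".
constraints = [Triple("eats", "nsubj", "dog"), Triple("bone", "amod", "big")]

# Hand-written (assumed) parses of two candidate outputs.
good_tokens = "the dog eats a big bone".split()
good_parse = [
    Triple("eats", "nsubj", "dog"),
    Triple("eats", "obj", "bone"),
    Triple("bone", "amod", "big"),
]

bad_tokens = "the bone eats a big dog".split()
bad_parse = [
    Triple("eats", "nsubj", "bone"),
    Triple("eats", "obj", "dog"),
    Triple("dog", "amod", "big"),
]

for name, tokens, parse in [("good", good_tokens, good_parse),
                            ("bad", bad_tokens, bad_parse)]:
    print(name, words_present(tokens, constraints),
          relations_satisfied(parse, constraints))
```

Both candidates pass the lexical check, but only the first satisfies the relation constraints, which is the gap the abstract describes.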

name change, relation, relation-constrained decoding, (6 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Neural Information Processing Systems, Oct-2-2025, 05:11:15 GMT

Visualizing and Measuring the Geometry of BERT

Emily Reif, Ann Yuan, Martin Wattenberg, Fernanda B. Viegas, Andy Coenen, Adam Pearce, Been Kim

Transformer architectures show significant promise for natural language processing.

artificial intelligence, machine learning, natural language, (18 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Kabir, Md Ahsanul, Jahin, Abrar, Hasan, Mohammad Al

Extracting Cause-Effect Pairs from a Sentence with a Dependency-Aware Transformer Model

arXiv.org Artificial Intelligence, Jul-15-2025

Extracting cause and effect phrases from a sentence is an important NLP task, with numerous applications in various domains, including legal, medical, education, and scientific research. Many unsupervised and supervised methods have been proposed for this task. Among these, unsupervised methods use linguistic tools, such as syntactic patterns, dependency trees, and dependency relations among sentential units, to extract the cause and effect phrases. Contemporary supervised methods, on the other hand, use deep-learning-based masked language models equipped with a token classification layer. Linguistic tools, specifically the dependency tree, which organizes a sentence into different semantic units, have been shown to be very effective for extracting semantic pairs from a sentence, but existing supervised methods have no provision for using such tools within their model framework. In this work, we propose DepBERT, which extends a transformer-based model by incorporating the dependency tree of a sentence within the model framework. Extensive experiments over three datasets show that DepBERT outperforms various state-of-the-art supervised causality extraction methods.

large language model, machine learning, natural language, (20 more...)

2507.09925

Country:

Europe (0.68)
North America > United States > Indiana (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Jun-10-2025

Extending dependencies to the taggedPBC: Word order in transitive clauses

Ring, Hiram

The taggedPBC (Ring 2025a) contains more than 1,800 sentences of POS-tagged parallel text data from over 1,500 languages, representing 133 language families and 111 isolates. While this dwarfs previously available resources, and the POS tags achieve decent accuracy, allowing for predictive crosslinguistic insights (Ring 2025b), the dataset was not initially annotated for dependencies. This paper reports on a CoNLL-U-formatted version of the dataset which transfers dependency information along with POS tags to all languages in the taggedPBC. Although there are various concerns regarding the quality of the tags and the dependencies, word order information derived from this dataset regarding the position of arguments and predicates in transitive clauses correlates with expert determinations of word order in three typological databases (WALS, Grambank, Autotyp). This highlights the usefulness of corpus-based typological approaches (as per Baylor et al. 2023; Bjerva 2024) for extending comparisons of discrete linguistic categories, and suggests that important insights can be gained even from noisy data, given sufficient annotation. The dependency-annotated corpora are also made available for research and collaboration via GitHub.
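One way word order in transitive clauses could be derived from dependency annotations is sketched below. This is an illustrative assumption, not the paper's actual procedure: the function names and the two toy sentences are hypothetical, and only the standard ten-column CoNLL-U layout and the Universal Dependencies labels `nsubj` and `obj` are relied on.

```python
from collections import Counter


def parse_conllu(text):
    """Parse CoNLL-U text into sentences of (id, form, upos, head, deprel) tuples."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:                      # blank line ends a sentence
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):          # sentence-level comment
            continue
        cols = line.split("\t")
        # Skip malformed rows, multiword ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
        if len(cols) != 10 or not cols[0].isdigit() or not cols[6].isdigit():
            continue
        current.append((int(cols[0]), cols[1], cols[3], int(cols[6]), cols[7]))
    if current:
        sentences.append(current)
    return sentences


def transitive_orders(sentences):
    """Count S/V/O orders for verbs with exactly one nsubj and one obj dependent."""
    counts = Counter()
    for sent in sentences:
        for verb_id, _, upos, _, _ in sent:
            if upos != "VERB":
                continue
            subj = [i for i, _, _, head, rel in sent if head == verb_id and rel == "nsubj"]
            obj = [i for i, _, _, head, rel in sent if head == verb_id and rel == "obj"]
            if len(subj) == 1 and len(obj) == 1:
                ordered = sorted([(subj[0], "S"), (verb_id, "V"), (obj[0], "O")])
                counts["".join(label for _, label in ordered)] += 1
    return counts


# Two toy sentences (hypothetical data), built with explicit tab separators.
rows = [
    [("1", "The", "the", "DET", "_", "_", "2", "det", "_", "_"),
     ("2", "dog", "dog", "NOUN", "_", "_", "3", "nsubj", "_", "_"),
     ("3", "chased", "chase", "VERB", "_", "_", "0", "root", "_", "_"),
     ("4", "the", "the", "DET", "_", "_", "5", "det", "_", "_"),
     ("5", "cat", "cat", "NOUN", "_", "_", "3", "obj", "_", "_")],
    [("1", "Köpek", "köpek", "NOUN", "_", "_", "3", "nsubj", "_", "_"),
     ("2", "kediyi", "kedi", "NOUN", "_", "_", "3", "obj", "_", "_"),
     ("3", "kovaladı", "kovala", "VERB", "_", "_", "0", "root", "_", "_")],
]
sample = "\n\n".join("\n".join("\t".join(r) for r in sent) for sent in rows)

counts = transitive_orders(parse_conllu(sample))
print(dict(counts))
```

Aggregating such counts per language gives a corpus-based estimate of dominant order that can then be compared against expert classifications; with noisy automatic annotation, restricting to clauses with exactly one subject and one object is one simple way to limit spurious matches.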

artificial intelligence, dependency, natural language, (17 more...)

2506.06785

Country:

Asia (0.46)
Europe (0.46)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.88)

Zeldes, Amir, Speransky, Nina, Wagner, Nicholas, Schroeder, Caroline T.

A UD Treebank for Bohairic Coptic

arXiv.org Artificial Intelligence, Jun-10-2025

Despite recent advances in digital resources for other Coptic dialects, especially Sahidic, Bohairic Coptic, the main Coptic dialect for pre-Mamluk, late Byzantine Egypt, and the contemporary language of the Coptic Church, remains critically under-resourced. This paper presents and evaluates the first syntactically annotated corpus of Bohairic Coptic, sampling data from a range of works, including Biblical text, saints' lives and Christian ascetic writing. We also explore some of the main differences we observe compared to the existing UD treebank of Sahidic Coptic, the classical dialect of the language, and conduct joint and cross-dialect parsing experiments, revealing the unique nature of Bohairic as a related, but distinct variety from the more often studied Sahidic.

artificial intelligence, natural language, treebank, (17 more...)

2504.18386

Country:

North America > United States (0.68)
Europe > Germany (0.68)
Africa > Middle East > Egypt (0.49)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Kandala, Ratna, Mondal, Prakash

A Unified Representation for Continuity and Discontinuity: Syntactic and Computational Motivations

arXiv.org Artificial Intelligence, Jun-9-2025

The correspondence principle is proposed to enable a unified representation of the representational principles from PSG, DG, and CG. To that end, the paper first illustrates a series of steps in achieving a unified representation for a discontinuous subordinate clause from Turkish as an illustrative case. This affords a new way of approaching discontinuity in natural language from a theoretical point of view that unites and integrates the basic tenets of PSG, DG, and CG, with significant consequences for syntactic analysis. Then this paper demonstrates that a unified representation can simplify computational complexity with regards to the neurocognitive representation and processing of both continuous and discontinuous sentences vis-à-vis the basic principles of PSG, DG, and CG.

1 Introduction

Discontinuity refers to a case of non-adjacency when a predicate and its argument(s) are not adjacent as per the linear order of the sentence -- predicate structure here may apply to constituents such as verb phrases, noun phrases, adjective phrases, etc. It is typically observed in free word order languages including Australian languages such as Warlpiri, Jiwarli, Turkish (Hale, 1982, 1983; Nordlinger, 2014). Figure 1 depicts a schematic representation of continuity and discontinuity.

artificial intelligence, natural language, relation, (16 more...)

2506.05686

Country:

Europe (0.93)
Asia (0.67)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.28)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Kandala, Ratna, Mondal, Prakash

Towards a Unified System of Representation for Continuity and Discontinuity in Natural Language

arXiv.org Artificial Intelligence, Jun-6-2025

Syntactic discontinuity is a grammatical phenomenon in which a constituent is split into more than one part because of the insertion of an element which is not part of the constituent. This is observed in many languages across the world, such as Turkish, Russian, Japanese, Warlpiri, Navajo, Hopi, Dyirbal, and Yidiny. Different formalisms/frameworks in current linguistic theory approach the problem of discontinuous structures in different ways. Each framework/formalism has widely been viewed as an independent and non-converging system of analysis. In this paper, we propose a unified system of representation for both continuity and discontinuity in structures of natural languages by taking into account three formalisms, in particular, Phrase Structure Grammar (PSG) for its widely used notion of constituency, Dependency Grammar (DG) for its head-dependent relations, and Categorial Grammar (CG) for its focus on functor-argument relations. We attempt to show that discontinuous expressions as well as continuous structures can be analysed through a unified mathematical derivation incorporating the representations of linguistic structure in these three grammar formalisms.

artificial intelligence, natural language, relation, (18 more...)

2506.05235

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Neural Information Processing Systems, Jan-18-2025, 12:28:21 GMT

Relation-Constrained Decoding for Text Generation

relation, relation-constrained decoding, text generation, (4 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Chen, Ping, Alo, Richard, Rundell, Justin

From Language To Vision: A Case Study of Text Animation

arXiv.org Artificial Intelligence, Jan-5-2025

Information can be expressed in multiple formats, including natural language, images, and motions. Human intelligence usually has little difficulty converting from one format to another, and this ability often shows a true understanding of the encoded information. Moreover, such conversions have broad use in many real-world applications. In this paper, we present a text visualization system that can visualize free text with animations. Our system is illustrated by visualizing example sentences of elementary physics laws.

artificial intelligence, machine learning, natural language, (20 more...)

2501.02549

Country:

North America > United States > Texas (0.14)
North America > United States > New Jersey (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)